PoDiGG: A Public Transport RDF Dataset Generator

Authors

  • Ruben Taelman
  • Ruben Verborgh
  • Tom De Nies
  • Erik Mannens
Abstract

A large amount of public transport data is made available by many different providers, which makes RDF a well-suited method for integrating these datasets. Furthermore, this type of data is a rich source of information that combines both geospatial and temporal data. These aspects are currently undertested in RDF data management systems because of the limited availability of realistic input datasets. To bring public transport data to the world of benchmarking, we need to be able to create synthetic variants of this data. In this paper, we introduce a dataset generator that can create realistic public transport data. This generator, which can be configured on different levels, makes it easier to use public transport data for benchmarking with great flexibility.

Similar resources

An RDF Dataset Generator for the Social Network Benchmark with Real-World Coherence

Synthetic datasets used in benchmarking need to mimic all characteristics of real-world datasets, in order to provide realistic benchmarking results. Synthetic RDF datasets usually show a significant discrepancy in the level of structuredness compared to real-world RDF datasets. This structural difference is important as it directly affects storage, indexing and querying. In this paper, we show...

Apples and Oranges: A Comparison of RDF Benchmarks and Real RDF Datasets

The widespread adoption of the Resource Description Framework (RDF) for the representation of both open web and enterprise data is the driving force behind the increasing research interest in RDF data management. As RDF data management systems proliferate, so are benchmarks to test the scalability and performance of these systems under data and workloads with various characteristics. In this pa...

Evaluating SPARQL 1.1 Property Path Support

With the release of SPARQL 1.1 in 2013 property paths were introduced, which make it possible to describe queries that do not explicitly define the length of the path that is traversed within an RDF graph. Already existing RDF stores were adapted to support property paths. In order to give an insight on how well the current implementations of property paths in RDF stores work, we introduce a be...

EvoGen: a Generator for Synthetic Versioned RDF

Synthetic data are widely used for evaluation, testing, and experimentation. However, there is a lack of systems, tools and datasets that can be used for benchmarking in the context of evolution. In the case of RDF, generation of synthetic data that change through time must take into account evolving paradigms and characteristics that make sense, rather than arbitrary insertions and deletions o...

Evaluating and Analyzing Inconsistent RDF Data in a Semantic Dataset: EMAGE Dataset

This paper explains how to evaluate and analyse inconsistent Resource Description Framework (RDF) data by using EMAGE semantic (RDF) dataset as its use case. The author exploits the sub graph matching powers and mathematical functions of SPARQL query in evaluating inconsistent RDF data in a semantic dataset. He also proposes a mathematical method for calculating the amount of inconsistency in R...

Publication date: 2017